iNextCube: Information Network-Enhanced Text Cube

Authors

  • Yintao Yu
  • Cindy Xide Lin
  • Yizhou Sun
  • Chen Chen
  • Jiawei Han
  • Binbin Liao
  • Tianyi Wu
  • ChengXiang Zhai
  • Duo Zhang
  • Bo Zhao
Abstract

Most business, administrative, and scientific databases today contain both structured attributes and text attributes. We call a database that consists of both multidimensional structured data and narrative text data a multidimensional text database. Searching, OLAP, and mining in such databases pose many research challenges. To enhance the power of data analysis, interesting entities and relationships can be extracted from such databases to derive heterogeneous information networks, which in turn substantially increase the power and flexibility of data exploration. Building on our previous studies of TextCube [1], TopicCube [2], and information network analysis such as RankClus [3] and NetClus [4], we construct iNextCube, an information-Network-enhanced text Cube. In this demo, we show the power of iNextCube in the search and analysis of two multidimensional text databases: (i) a DBLP-based CS bibliographic database, and (ii) an online news database.
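The two ideas in the abstract, aggregating text over structured dimensions and deriving a typed network from the same records, can be illustrated with a minimal sketch. This is not the iNextCube implementation: the toy records, the field names (`venue`, `year`, `authors`, `text`), and the helper functions `text_cube` and `heterogeneous_edges` are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: structured dimensions plus a free-text field.
records = [
    {"venue": "VLDB", "year": 2009, "authors": ["author_a", "author_b"],
     "text": "text cube olap"},
    {"venue": "VLDB", "year": 2008, "authors": ["author_a"],
     "text": "network ranking cube"},
    {"venue": "KDD", "year": 2009, "authors": ["author_c"],
     "text": "network clustering"},
]


def text_cube(records, dims):
    """Group records by the chosen dimensions and sum term counts per cell."""
    cells = defaultdict(Counter)
    for record in records:
        key = tuple(record[d] for d in dims)
        cells[key].update(record["text"].split())
    return dict(cells)


def heterogeneous_edges(records):
    """Count typed edges (author-venue, author-term) found in the records."""
    edges = Counter()
    for record in records:
        for author in record["authors"]:
            edges[("author", author, "venue", record["venue"])] += 1
            # Count each term at most once per record.
            for term in set(record["text"].split()):
                edges[("author", author, "term", term)] += 1
    return edges


by_venue = text_cube(records, ["venue"])          # roll-up over year
by_venue_year = text_cube(records, ["venue", "year"])  # finer-grained cells
edges = heterogeneous_edges(records)
```

Choosing fewer dimensions in `dims` corresponds to rolling up the cube, while the edge counts give a simple weighted network over authors, venues, and terms that ranking or clustering methods could then consume.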


Similar resources

Cube Index: A Text Index Model for Retrieval and Mining

Text retrieval, Analysis, Mining and Knowledge management have gained a lot of importance in a time when we drown in information but are starved for knowledge. In this paper, we propose a novel Index that uses a Text Cube model to store the text information similar to a data cube in Data Mining. This model creates a direct index, next word index and inverted index in a single Cube Index which i...

OPTIMIZING THE INFORMATION SPEED IN TELEMEDICINE NETWORK BY INCREASING THE SPEED OF NODES


M-Cube: A Visualization Tool for Multi-dimensional Multimedia Databases

The last decade was marked by a striking growth on database size and dimension. This increase is noticeable in many areas, ranging from personal data storage to large corporation databases. The size and high dimensionality of these data sets prompt the application of specialized graphical representations rather than tables and generic charts to visualize the data. This specialization limits the...

Doc2Cube: Automated Document Allocation to Text Cube via Dimension-Aware Joint Embedding

Data cube is a cornerstone architecture in multidimensional analysis of structured datasets. It is highly desirable to conduct multidimensional analysis on text corpora with cube structures for various text-intensive applications in healthcare, business intelligence, and social media analysis. However, one bottleneck to constructing text cube is to automatically put millions of documents into t...


Journal:
  • PVLDB

Volume 2

Publication date: 2009